Overview

Dataset statistics

Number of variables: 39
Number of observations: 260601
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 177.0 MiB
Average record size in memory: 712.0 B

Variable types

Numeric: 9
Categorical: 30

Alerts

High correlation: count_floors_pre_eq is highly overall correlated with height_percentage and 1 other field
High correlation: height_percentage is highly overall correlated with count_floors_pre_eq
High correlation: foundation_type is highly overall correlated with roof_type and 4 other fields
High correlation: roof_type is highly overall correlated with foundation_type and 1 other field
High correlation: ground_floor_type is highly overall correlated with has_superstructure_cement_mortar_brick
High correlation: other_floor_type is highly overall correlated with count_floors_pre_eq and 1 other field
High correlation: has_superstructure_mud_mortar_stone is highly overall correlated with foundation_type
High correlation: has_superstructure_cement_mortar_brick is highly overall correlated with foundation_type and 1 other field
High correlation: has_superstructure_rc_non_engineered is highly overall correlated with foundation_type
High correlation: has_superstructure_rc_engineered is highly overall correlated with foundation_type
High correlation: has_secondary_use is highly overall correlated with has_secondary_use_agriculture and 1 other field
High correlation: has_secondary_use_agriculture is highly overall correlated with has_secondary_use
High correlation: has_secondary_use_hotel is highly overall correlated with has_secondary_use
Imbalance: land_surface_condition is highly imbalanced (51.3%)
Imbalance: foundation_type is highly imbalanced (60.9%)
Imbalance: ground_floor_type is highly imbalanced (59.3%)
Imbalance: position is highly imbalanced (50.4%)
Imbalance: plan_configuration is highly imbalanced (90.7%)
Imbalance: has_superstructure_adobe_mud is highly imbalanced (56.8%)
Imbalance: has_superstructure_stone_flag is highly imbalanced (78.4%)
Imbalance: has_superstructure_cement_mortar_stone is highly imbalanced (86.9%)
Imbalance: has_superstructure_mud_mortar_brick is highly imbalanced (64.1%)
Imbalance: has_superstructure_cement_mortar_brick is highly imbalanced (61.5%)
Imbalance: has_superstructure_bamboo is highly imbalanced (58.0%)
Imbalance: has_superstructure_rc_non_engineered is highly imbalanced (74.6%)
Imbalance: has_superstructure_rc_engineered is highly imbalanced (88.2%)
Imbalance: has_superstructure_other is highly imbalanced (88.8%)
Imbalance: legal_ownership_status is highly imbalanced (86.0%)
Imbalance: has_secondary_use_agriculture is highly imbalanced (65.5%)
Imbalance: has_secondary_use_hotel is highly imbalanced (78.8%)
Imbalance: has_secondary_use_rental is highly imbalanced (93.2%)
Imbalance: has_secondary_use_institution is highly imbalanced (98.9%)
Imbalance: has_secondary_use_school is highly imbalanced (99.5%)
Imbalance: has_secondary_use_industry is highly imbalanced (98.8%)
Imbalance: has_secondary_use_health_post is highly imbalanced (99.7%)
Imbalance: has_secondary_use_gov_office is highly imbalanced (99.8%)
Imbalance: has_secondary_use_use_police is highly imbalanced (99.9%)
Imbalance: has_secondary_use_other is highly imbalanced (95.4%)
Unique: building_id has unique values
Zeros: geo_level_1_id has 4011 (1.5%) zeros
Zeros: age has 26041 (10.0%) zeros
Zeros: count_families has 20862 (8.0%) zeros
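
The imbalance percentages in the alerts are consistent with a score of one minus the normalized Shannon entropy of the category counts. The sketch below assumes that definition (it is inferred from the reported numbers, not quoted from the profiler's documentation) and checks it against the value counts this report lists for foundation_type, which is flagged at 60.9%. The helper name `imbalance_score` is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log(k), where H is the Shannon entropy of the counts.

    The score is 0 for a perfectly even split and approaches 1 as a single
    category dominates. This definition is an assumption inferred from the
    percentages in the report.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single category has no split to measure
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)

# Value counts listed for foundation_type in this report (r, w, u, i, h).
foundation_type_counts = [219196, 15118, 14260, 10579, 1448]
score = imbalance_score(foundation_type_counts)
print(f"foundation_type imbalance: {score:.3f}")
```

Under this assumed definition the score for foundation_type comes out close to the 60.9% shown in the alert, and an even two-way split scores 0.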

Reproduction

Analysis started: 2024-04-11 09:39:57.624146
Analysis finished: 2024-04-11 09:40:20.433003
Duration: 22.81 seconds
Software version: pandas-profiling v0.0.dev0
Download configuration: config.json

Variables

building_id
Real number (ℝ)

Distinct: 260601
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 525675.48
Minimum: 4
Maximum: 1052934
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 4
5-th percentile: 52114
Q1: 261190
median: 525757
Q3: 789762
95-th percentile: 1000724
Maximum: 1052934
Range: 1052930
Interquartile range (IQR): 528572

Descriptive statistics

Standard deviation: 304545
Coefficient of variation (CV): 0.57934031
Kurtosis: -1.203879
Mean: 525675.48
Median Absolute Deviation (MAD): 264277
Skewness: 0.0018823567
Sum: 1.3699156 × 10^11
Variance: 9.2747656 × 10^10
Monotonicity: Not monotonic
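
Several of the derived statistics follow directly from the others. As a quick consistency check, the sketch below recomputes the range, the IQR, and the coefficient of variation from the building_id figures printed in this report. The variable names are illustrative, and the standard deviation is used as printed (rounded), so the CV agrees only to about six digits.

```python
# Figures copied from the building_id section of this report.
minimum, maximum = 4, 1052934
q1, q3 = 261190, 789762
mean, std = 525675.48, 304545

value_range = maximum - minimum  # spread between the extremes
iqr = q3 - q1                    # spread of the middle 50% of values
cv = std / mean                  # standard deviation relative to the mean

print(value_range, iqr, round(cv, 4))
```

The recomputed range and IQR match the reported 1052930 and 528572 exactly.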
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
802906 1
 
< 0.1%
680296 1
 
< 0.1%
802531 1
 
< 0.1%
544902 1
 
< 0.1%
823257 1
 
< 0.1%
373540 1
 
< 0.1%
627590 1
 
< 0.1%
421951 1
 
< 0.1%
241191 1
 
< 0.1%
1024699 1
 
< 0.1%
Other values (260591) 260591
> 99.9%
ValueCountFrequency (%)
4 1
< 0.1%
8 1
< 0.1%
12 1
< 0.1%
16 1
< 0.1%
17 1
< 0.1%
25 1
< 0.1%
28 1
< 0.1%
31 1
< 0.1%
34 1
< 0.1%
36 1
< 0.1%
ValueCountFrequency (%)
1052934 1
< 0.1%
1052931 1
< 0.1%
1052929 1
< 0.1%
1052926 1
< 0.1%
1052921 1
< 0.1%
1052915 1
< 0.1%
1052911 1
< 0.1%
1052909 1
< 0.1%
1052908 1
< 0.1%
1052906 1
< 0.1%

geo_level_1_id
Real number (ℝ)

Distinct: 31
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.900353
Minimum: 0
Maximum: 30
Zeros: 4011
Zeros (%): 1.5%
Negative: 0
Negative (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 3
Q1: 7
median: 12
Q3: 21
95-th percentile: 27
Maximum: 30
Range: 30
Interquartile range (IQR): 14

Descriptive statistics

Standard deviation: 8.0336166
Coefficient of variation (CV): 0.57794334
Kurtosis: -1.2132488
Mean: 13.900353
Median Absolute Deviation (MAD): 6
Skewness: 0.27253035
Sum: 3622446
Variance: 64.538996
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=31)
ValueCountFrequency (%)
6 24381
 
9.4%
26 22615
 
8.7%
10 22079
 
8.5%
17 21813
 
8.4%
8 19080
 
7.3%
7 18994
 
7.3%
20 17216
 
6.6%
21 14889
 
5.7%
4 14568
 
5.6%
27 12532
 
4.8%
Other values (21) 72434
27.8%
ValueCountFrequency (%)
0 4011
 
1.5%
1 2701
 
1.0%
2 931
 
0.4%
3 7540
 
2.9%
4 14568
5.6%
5 2690
 
1.0%
6 24381
9.4%
7 18994
7.3%
8 19080
7.3%
9 3958
 
1.5%
ValueCountFrequency (%)
30 2686
 
1.0%
29 396
 
0.2%
28 265
 
0.1%
27 12532
4.8%
26 22615
8.7%
25 5624
 
2.2%
24 1310
 
0.5%
23 1121
 
0.4%
22 6252
 
2.4%
21 14889
5.7%

geo_level_2_id
Real number (ℝ)

Distinct: 1414
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 701.07469
Minimum: 0
Maximum: 1427
Zeros: 38
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 69
Q1: 350
median: 702
Q3: 1050
95-th percentile: 1377
Maximum: 1427
Range: 1427
Interquartile range (IQR): 700

Descriptive statistics

Standard deviation: 412.71073
Coefficient of variation (CV): 0.58868298
Kurtosis: -1.1882325
Mean: 701.07469
Median Absolute Deviation (MAD): 349
Skewness: 0.028957381
Sum: 1.8270076 × 10^8
Variance: 170330.15
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
39 4038
 
1.5%
158 2520
 
1.0%
181 2080
 
0.8%
1387 2040
 
0.8%
157 1897
 
0.7%
363 1760
 
0.7%
463 1740
 
0.7%
673 1704
 
0.7%
533 1684
 
0.6%
883 1626
 
0.6%
Other values (1404) 239512
91.9%
ValueCountFrequency (%)
0 38
 
< 0.1%
1 204
0.1%
3 77
 
< 0.1%
4 315
0.1%
5 25
 
< 0.1%
6 2
 
< 0.1%
7 100
 
< 0.1%
8 120
 
< 0.1%
9 333
0.1%
10 354
0.1%
ValueCountFrequency (%)
1427 6
 
< 0.1%
1426 286
0.1%
1425 466
0.2%
1424 7
 
< 0.1%
1423 3
 
< 0.1%
1422 216
0.1%
1421 254
0.1%
1420 10
 
< 0.1%
1419 95
 
< 0.1%
1418 152
 
0.1%

geo_level_3_id
Real number (ℝ)

Distinct: 11595
Distinct (%): 4.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6257.8761
Minimum: 0
Maximum: 12567
Zeros: 2
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 611
Q1: 3073
median: 6270
Q3: 9412
95-th percentile: 11927
Maximum: 12567
Range: 12567
Interquartile range (IQR): 6339

Descriptive statistics

Standard deviation: 3646.3696
Coefficient of variation (CV): 0.58268485
Kurtosis: -1.2138965
Mean: 6257.8761
Median Absolute Deviation (MAD): 3171
Skewness: 0.00039351209
Sum: 1.6308088 × 10^9
Variance: 13296012
Monotonicity: Not monotonic
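
A skewness close to 0 together with a kurtosis close to -1.2 is what a roughly uniform spread of identifier values would produce. The sketch below illustrates this on synthetic, evenly spaced integers (not the dataset itself) using population moment formulas. The profiler may apply a bias-corrected estimator, so small differences from the reported values are expected. The function name is illustrative.

```python
import statistics

def skew_and_excess_kurtosis(values):
    """Population skewness and excess kurtosis computed from central moments."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

# Synthetic stand-in: every integer from 0 to 12567 appearing exactly once.
skew, kurt = skew_and_excess_kurtosis(range(12568))
print(skew, round(kurt, 3))
```

For this synthetic uniform case the skewness is essentially zero and the excess kurtosis is very near -1.2, in line with the figures reported for geo_level_3_id.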
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
633 651
 
0.2%
9133 647
 
0.2%
621 530
 
0.2%
11246 470
 
0.2%
2005 466
 
0.2%
11440 455
 
0.2%
7723 443
 
0.2%
9229 381
 
0.1%
2452 349
 
0.1%
12258 312
 
0.1%
Other values (11585) 255897
98.2%
ValueCountFrequency (%)
0 2
 
< 0.1%
1 6
 
< 0.1%
3 9
 
< 0.1%
5 14
 
< 0.1%
6 21
 
< 0.1%
7 2
 
< 0.1%
8 31
< 0.1%
9 3
 
< 0.1%
10 1
 
< 0.1%
11 62
< 0.1%
ValueCountFrequency (%)
12567 1
 
< 0.1%
12565 7
 
< 0.1%
12564 6
 
< 0.1%
12563 24
< 0.1%
12562 3
 
< 0.1%
12561 19
< 0.1%
12560 17
 
< 0.1%
12559 6
 
< 0.1%
12558 6
 
< 0.1%
12557 44
< 0.1%

count_floors_pre_eq
Real number (ℝ)

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.1297232
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 2
Q3: 2
95-th percentile: 3
Maximum: 9
Range: 8
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.72766455
Coefficient of variation (CV): 0.34167095
Kurtosis: 2.3225979
Mean: 2.1297232
Median Absolute Deviation (MAD): 0
Skewness: 0.83411296
Sum: 555008
Variance: 0.52949569
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)
ValueCountFrequency (%)
2 156623
60.1%
3 55617
 
21.3%
1 40441
 
15.5%
4 5424
 
2.1%
5 2246
 
0.9%
6 209
 
0.1%
7 39
 
< 0.1%
8 1
 
< 0.1%
9 1
 
< 0.1%
ValueCountFrequency (%)
1 40441
 
15.5%
2 156623
60.1%
3 55617
 
21.3%
4 5424
 
2.1%
5 2246
 
0.9%
6 209
 
0.1%
7 39
 
< 0.1%
8 1
 
< 0.1%
9 1
 
< 0.1%
ValueCountFrequency (%)
9 1
 
< 0.1%
8 1
 
< 0.1%
7 39
 
< 0.1%
6 209
 
0.1%
5 2246
 
0.9%
4 5424
 
2.1%
3 55617
 
21.3%
2 156623
60.1%
1 40441
 
15.5%

age
Real number (ℝ)

Distinct: 42
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 26.535029
Minimum: 0
Maximum: 995
Zeros: 26041
Zeros (%): 10.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 10
median: 15
Q3: 30
95-th percentile: 60
Maximum: 995
Range: 995
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 73.565937
Coefficient of variation (CV): 2.7724084
Kurtosis: 157.24824
Mean: 26.535029
Median Absolute Deviation (MAD): 10
Skewness: 12.192494
Sum: 6915055
Variance: 5411.947
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=42)
ValueCountFrequency (%)
10 38896
14.9%
15 36010
13.8%
5 33697
12.9%
20 32182
12.3%
0 26041
10.0%
25 24366
9.3%
30 18028
6.9%
35 10710
 
4.1%
40 10559
 
4.1%
50 7257
 
2.8%
Other values (32) 22855
8.8%
ValueCountFrequency (%)
0 26041
10.0%
5 33697
12.9%
10 38896
14.9%
15 36010
13.8%
20 32182
12.3%
25 24366
9.3%
30 18028
6.9%
35 10710
 
4.1%
40 10559
 
4.1%
45 4711
 
1.8%
ValueCountFrequency (%)
995 1390
0.5%
200 106
 
< 0.1%
195 2
 
< 0.1%
190 3
 
< 0.1%
185 1
 
< 0.1%
180 7
 
< 0.1%
175 5
 
< 0.1%
170 6
 
< 0.1%
165 2
 
< 0.1%
160 6
 
< 0.1%
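
The largest age value, 995, occurs 1390 times while the next largest value is 200. That gap suggests 995 may be a placeholder code rather than a measured age. This is an assumption, since the report itself does not say so. The sketch below uses only totals printed in this section to show how much those rows move the mean. The variable names are illustrative.

```python
# Totals copied from the age section of this report.
n_rows = 260601
age_sum = 6915055
placeholder_value = 995   # assumed to be a placeholder, not a measured age
placeholder_count = 1390

# Mean over all rows, as reported.
mean_all = age_sum / n_rows

# Mean after setting aside the rows that carry the assumed placeholder.
mean_without = (age_sum - placeholder_value * placeholder_count) / (
    n_rows - placeholder_count
)

print(round(mean_all, 2), round(mean_without, 2))
```

Setting those rows aside lowers the mean from roughly 26.5 to roughly 21.3, which is one reason the reported skewness and kurtosis for age are so large.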

area_percentage
Real number (ℝ)

Distinct: 84
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.0180506
Minimum: 1
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 5
median: 7
Q3: 9
95-th percentile: 16
Maximum: 100
Range: 99
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 4.3922309
Coefficient of variation (CV): 0.54779287
Kurtosis: 30.438258
Mean: 8.0180506
Median Absolute Deviation (MAD): 2
Skewness: 3.5260823
Sum: 2089512
Variance: 19.291693
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
6 42013
16.1%
7 36752
14.1%
5 32724
12.6%
8 28445
10.9%
9 22199
8.5%
4 19236
7.4%
10 15613
 
6.0%
11 13907
 
5.3%
3 11837
 
4.5%
12 7581
 
2.9%
Other values (74) 30294
11.6%
ValueCountFrequency (%)
1 90
 
< 0.1%
2 3181
 
1.2%
3 11837
 
4.5%
4 19236
7.4%
5 32724
12.6%
6 42013
16.1%
7 36752
14.1%
8 28445
10.9%
9 22199
8.5%
10 15613
 
6.0%
ValueCountFrequency (%)
100 1
 
< 0.1%
96 3
< 0.1%
90 1
 
< 0.1%
86 5
< 0.1%
85 4
< 0.1%
84 3
< 0.1%
83 3
< 0.1%
82 1
 
< 0.1%
80 1
 
< 0.1%
78 1
 
< 0.1%

height_percentage
Real number (ℝ)

Distinct: 27
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.4343652
Minimum: 2
Maximum: 32
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 2
5-th percentile: 3
Q1: 4
median: 5
Q3: 6
95-th percentile: 9
Maximum: 32
Range: 30
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.9184182
Coefficient of variation (CV): 0.35301607
Kurtosis: 14.318526
Mean: 5.4343652
Median Absolute Deviation (MAD): 1
Skewness: 1.8082618
Sum: 1416201
Variance: 3.6803285
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=27)
ValueCountFrequency (%)
5 78513
30.1%
6 46477
17.8%
4 37763
14.5%
7 35465
13.6%
3 25957
 
10.0%
8 13902
 
5.3%
2 9305
 
3.6%
9 5376
 
2.1%
10 4492
 
1.7%
11 917
 
0.4%
Other values (17) 2434
 
0.9%
ValueCountFrequency (%)
2 9305
 
3.6%
3 25957
 
10.0%
4 37763
14.5%
5 78513
30.1%
6 46477
17.8%
7 35465
13.6%
8 13902
 
5.3%
9 5376
 
2.1%
10 4492
 
1.7%
11 917
 
0.4%
ValueCountFrequency (%)
32 75
< 0.1%
31 1
 
< 0.1%
28 2
 
< 0.1%
26 2
 
< 0.1%
25 3
 
< 0.1%
24 4
 
< 0.1%
23 11
 
< 0.1%
21 13
 
< 0.1%
20 33
< 0.1%
19 7
 
< 0.1%

land_surface_condition
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.4 MiB
t: 216757
n: 35528
o: 8316

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters3
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowt
2nd rowo
3rd rowt
4th rowt
5th rowt

Common Values

ValueCountFrequency (%)
t 216757
83.2%
n 35528
 
13.6%
o 8316
 
3.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
t 216757
83.2%
n 35528
 
13.6%
o 8316
 
3.2%

Most occurring characters

ValueCountFrequency (%)
t 216757
83.2%
n 35528
 
13.6%
o 8316
 
3.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 260601
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 216757
83.2%
n 35528
 
13.6%
o 8316
 
3.2%

Most occurring scripts

ValueCountFrequency (%)
Latin 260601
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 216757
83.2%
n 35528
 
13.6%
o 8316
 
3.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 216757
83.2%
n 35528
 
13.6%
o 8316
 
3.2%

foundation_type
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.4 MiB
r: 219196
w: 15118
u: 14260
i: 10579
h: 1448

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters5
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowr
2nd rowr
3rd rowr
4th rowr
5th rowr

Common Values

ValueCountFrequency (%)
r 219196
84.1%
w 15118
 
5.8%
u 14260
 
5.5%
i 10579
 
4.1%
h 1448
 
0.6%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
r 219196
84.1%
w 15118
 
5.8%
u 14260
 
5.5%
i 10579
 
4.1%
h 1448
 
0.6%

Most occurring characters

ValueCountFrequency (%)
r 219196
84.1%
w 15118
 
5.8%
u 14260
 
5.5%
i 10579
 
4.1%
h 1448
 
0.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 260601
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 219196
84.1%
w 15118
 
5.8%
u 14260
 
5.5%
i 10579
 
4.1%
h 1448
 
0.6%

Most occurring scripts

ValueCountFrequency (%)
Latin 260601
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 219196
84.1%
w 15118
 
5.8%
u 14260
 
5.5%
i 10579
 
4.1%
h 1448
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 219196
84.1%
w 15118
 
5.8%
u 14260
 
5.5%
i 10579
 
4.1%
h 1448
 
0.6%

roof_type
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.4 MiB
n: 182842
q: 61576
x: 16183

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters3
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rown
2nd rown
3rd rown
4th rown
5th rown

Common Values

ValueCountFrequency (%)
n 182842
70.2%
q 61576
 
23.6%
x 16183
 
6.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
n 182842
70.2%
q 61576
 
23.6%
x 16183
 
6.2%

Most occurring characters

ValueCountFrequency (%)
n 182842
70.2%
q 61576
 
23.6%
x 16183
 
6.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 260601
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 182842
70.2%
q 61576
 
23.6%
x 16183
 
6.2%

Most occurring scripts

ValueCountFrequency (%)
Latin 260601
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 182842
70.2%
q 61576
 
23.6%
x 16183
 
6.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 182842
70.2%
q 61576
 
23.6%
x 16183
 
6.2%

ground_floor_type
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.4 MiB
f: 209619
x: 24877
v: 24593
z: 1004
m: 508

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters5
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowf
2nd rowx
3rd rowf
4th rowf
5th rowf

Common Values

ValueCountFrequency (%)
f 209619
80.4%
x 24877
 
9.5%
v 24593
 
9.4%
z 1004
 
0.4%
m 508
 
0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
f 209619
80.4%
x 24877
 
9.5%
v 24593
 
9.4%
z 1004
 
0.4%
m 508
 
0.2%

Most occurring characters

ValueCountFrequency (%)
f 209619
80.4%
x 24877
 
9.5%
v 24593
 
9.4%
z 1004
 
0.4%
m 508
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 260601
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
f 209619
80.4%
x 24877
 
9.5%
v 24593
 
9.4%
z 1004
 
0.4%
m 508
 
0.2%

Most occurring scripts

ValueCountFrequency (%)
Latin 260601
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
f 209619
80.4%
x 24877
 
9.5%
v 24593
 
9.4%
z 1004
 
0.4%
m 508
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
f 209619
80.4%
x 24877
 
9.5%
v 24593
 
9.4%
z 1004
 
0.4%
m 508
 
0.2%

other_floor_type
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.4 MiB
q: 165282
x: 43448
j: 39843
s: 12028

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters4
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowq
2nd rowq
3rd rowx
4th rowx
5th rowx

Common Values

ValueCountFrequency (%)
q 165282
63.4%
x 43448
 
16.7%
j 39843
 
15.3%
s 12028
 
4.6%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
q 165282
63.4%
x 43448
 
16.7%
j 39843
 
15.3%
s 12028
 
4.6%

Most occurring characters

ValueCountFrequency (%)
q 165282
63.4%
x 43448
 
16.7%
j 39843
 
15.3%
s 12028
 
4.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 260601
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
q 165282
63.4%
x 43448
 
16.7%
j 39843
 
15.3%
s 12028
 
4.6%

Most occurring scripts

ValueCountFrequency (%)
Latin 260601
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
q 165282
63.4%
x 43448
 
16.7%
j 39843
 
15.3%
s 12028
 
4.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
q 165282
63.4%
x 43448
 
16.7%
j 39843
 
15.3%
s 12028
 
4.6%

position
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.4 MiB
s: 202090
t: 42896
j: 13282
o: 2333

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters4
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowt
2nd rows
3rd rowt
4th rows
5th rows

Common Values

ValueCountFrequency (%)
s 202090
77.5%
t 42896
 
16.5%
j 13282
 
5.1%
o 2333
 
0.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
s 202090
77.5%
t 42896
 
16.5%
j 13282
 
5.1%
o 2333
 
0.9%

Most occurring characters

ValueCountFrequency (%)
s 202090
77.5%
t 42896
 
16.5%
j 13282
 
5.1%
o 2333
 
0.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 260601
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 202090
77.5%
t 42896
 
16.5%
j 13282
 
5.1%
o 2333
 
0.9%

Most occurring scripts

ValueCountFrequency (%)
Latin 260601
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 202090
77.5%
t 42896
 
16.5%
j 13282
 
5.1%
o 2333
 
0.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 202090
77.5%
t 42896
 
16.5%
j 13282
 
5.1%
o 2333
 
0.9%

plan_configuration
Categorical

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.4 MiB
d: 250072
q: 5692
u: 3649
s: 346
c: 325
Other values (5): 517

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters10
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowd
2nd rowd
3rd rowd
4th rowd
5th rowd

Common Values

ValueCountFrequency (%)
d 250072
96.0%
q 5692
 
2.2%
u 3649
 
1.4%
s 346
 
0.1%
c 325
 
0.1%
a 252
 
0.1%
o 159
 
0.1%
m 46
 
< 0.1%
n 38
 
< 0.1%
f 22
 
< 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
d 250072
96.0%
q 5692
 
2.2%
u 3649
 
1.4%
s 346
 
0.1%
c 325
 
0.1%
a 252
 
0.1%
o 159
 
0.1%
m 46
 
< 0.1%
n 38
 
< 0.1%
f 22
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
d 250072
96.0%
q 5692
 
2.2%
u 3649
 
1.4%
s 346
 
0.1%
c 325
 
0.1%
a 252
 
0.1%
o 159
 
0.1%
m 46
 
< 0.1%
n 38
 
< 0.1%
f 22
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 260601
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
d 250072
96.0%
q 5692
 
2.2%
u 3649
 
1.4%
s 346
 
0.1%
c 325
 
0.1%
a 252
 
0.1%
o 159
 
0.1%
m 46
 
< 0.1%
n 38
 
< 0.1%
f 22
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 260601
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
d 250072
96.0%
q 5692
 
2.2%
u 3649
 
1.4%
s 346
 
0.1%
c 325
 
0.1%
a 252
 
0.1%
o 159
 
0.1%
m 46
 
< 0.1%
n 38
 
< 0.1%
f 22
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
d 250072
96.0%
q 5692
 
2.2%
u 3649
 
1.4%
s 346
 
0.1%
c 325
 
0.1%
a 252
 
0.1%
o 159
 
0.1%
m 46
 
< 0.1%
n 38
 
< 0.1%
f 22
 
< 0.1%

has_superstructure_adobe_mud
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.4 MiB
0: 237500
1: 23101

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1
2nd row0
3rd row0
4th row0
5th row1

Common Values

ValueCountFrequency (%)
0 237500
91.1%
1 23101
 
8.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 237500
91.1%
1 23101
 
8.9%

Most occurring characters

ValueCountFrequency (%)
0 237500
91.1%
1 23101
 
8.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 260601
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 237500
91.1%
1 23101
 
8.9%

Most occurring scripts

ValueCountFrequency (%)
Common 260601
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 237500
91.1%
1 23101
 
8.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 237500
91.1%
1 23101
 
8.9%

has_superstructure_mud_mortar_stone
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.4 MiB
1: 198561
0: 62040

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1
2nd row1
3rd row1
4th row1
5th row0

Common Values

ValueCountFrequency (%)
1 198561
76.2%
0 62040
 
23.8%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1 198561
76.2%
0 62040
 
23.8%

Most occurring characters

ValueCountFrequency (%)
1 198561
76.2%
0 62040
 
23.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 260601
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 198561
76.2%
0 62040
 
23.8%

Most occurring scripts

ValueCountFrequency (%)
Common 260601
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 198561
76.2%
0 62040
 
23.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 198561
76.2%
0 62040
 
23.8%

has_superstructure_stone_flag
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.4 MiB
0: 251654
1: 8947

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 251654
96.6%
1 8947
 
3.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 251654
96.6%
1 8947
 
3.4%

Most occurring characters

ValueCountFrequency (%)
0 251654
96.6%
1 8947
 
3.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 260601
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 251654
96.6%
1 8947
 
3.4%

Most occurring scripts

ValueCountFrequency (%)
Common 260601
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 251654
96.6%
1 8947
 
3.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 251654
96.6%
1 8947
 
3.4%

has_superstructure_cement_mortar_stone
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.4 MiB
0: 255849
1: 4752

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 255849
98.2%
1 4752
 
1.8%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 255849
98.2%
1 4752
 
1.8%

Most occurring characters

ValueCountFrequency (%)
0 255849
98.2%
1 4752
 
1.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 260601
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 255849
98.2%
1 4752
 
1.8%

Most occurring scripts

ValueCountFrequency (%)
Common 260601
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 255849
98.2%
1 4752
 
1.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 255849
98.2%
1 4752
 
1.8%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size14.4 MiB
0
242840 
1
 
17761

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 242840
93.2%
1 17761
 
6.8%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 242840
93.2%
1 17761
 
6.8%

Most occurring characters

ValueCountFrequency (%)
0 242840
93.2%
1 17761
 
6.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 260601
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 242840
93.2%
1 17761
 
6.8%

Most occurring scripts

ValueCountFrequency (%)
Common 260601
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 242840
93.2%
1 17761
 
6.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 242840
93.2%
1 17761
 
6.8%

has_superstructure_cement_mortar_brick
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size14.4 MiB
0
240986 
1
 
19615

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 240986
92.5%
1 19615
 
7.5%
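
The IMBALANCE alert on this variable can be illustrated with a normalized-entropy score. The sketch below assumes the score is one minus the Shannon entropy of the class frequencies divided by the maximum possible entropy for that number of classes. That definition is an assumption, since the report does not state its formula, and the function name `imbalance_score` is hypothetical.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / H_max for class counts (0 = balanced, 1 = a single class)."""
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    total = sum(counts)
    # Skip empty classes: p * log2(p) tends to 0 as p tends to 0.
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(n_classes)


# Counts for values 0 and 1, taken from the Common Values table above.
score = imbalance_score([240986, 19615])
print(round(score, 3))
```

Under this assumed definition, a perfectly balanced binary column scores 0, and the score approaches 1 as one class dominates.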

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 240986
92.5%
1 19615
 
7.5%

Most occurring characters

ValueCountFrequency (%)
0 240986
92.5%
1 19615
 
7.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 260601
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 240986
92.5%
1 19615
 
7.5%

Most occurring scripts

ValueCountFrequency (%)
Common 260601
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 240986
92.5%
1 19615
 
7.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 240986
92.5%
1 19615
 
7.5%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size14.4 MiB
0
194151 
1
66450 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row1
5th row0

Common Values

ValueCountFrequency (%)
0 194151
74.5%
1 66450
 
25.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 194151
74.5%
1 66450
 
25.5%

Most occurring characters

ValueCountFrequency (%)
0 194151
74.5%
1 66450
 
25.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 260601
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 194151
74.5%
1 66450
 
25.5%

Most occurring scripts

ValueCountFrequency (%)
Common 260601
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 194151
74.5%
1 66450
 
25.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 194151
74.5%
1 66450
 
25.5%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size14.4 MiB
0
238447 
1
 
22154

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row1
5th row0

Common Values

ValueCountFrequency (%)
0 238447
91.5%
1 22154
 
8.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 238447
91.5%
1 22154
 
8.5%

Most occurring characters

ValueCountFrequency (%)
0 238447
91.5%
1 22154
 
8.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 260601
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 238447
91.5%
1 22154
 
8.5%

Most occurring scripts

ValueCountFrequency (%)
Common 260601
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 238447
91.5%
1 22154
 
8.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 238447
91.5%
1 22154
 
8.5%

has_superstructure_rc_non_engineered
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size14.4 MiB
0
249502 
1
 
11099

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 249502
95.7%
1 11099
 
4.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 249502
95.7%
1 11099
 
4.3%

Most occurring characters

ValueCountFrequency (%)
0 249502
95.7%
1 11099
 
4.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 260601
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 249502
95.7%
1 11099
 
4.3%

Most occurring scripts

ValueCountFrequency (%)
Common 260601
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 249502
95.7%
1 11099
 
4.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 249502
95.7%
1 11099
 
4.3%

has_superstructure_rc_engineered
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size14.4 MiB
0
256468 
1
 
4133

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 256468
98.4%
1 4133
 
1.6%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 256468
98.4%
1 4133
 
1.6%

Most occurring characters

ValueCountFrequency (%)
0 256468
98.4%
1 4133
 
1.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 260601
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 256468
98.4%
1 4133
 
1.6%

Most occurring scripts

ValueCountFrequency (%)
Common 260601
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 256468
98.4%
1 4133
 
1.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 256468
98.4%
1 4133
 
1.6%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size14.4 MiB
0
256696 
1
 
3905

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 256696
98.5%
1 3905
 
1.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 256696
98.5%
1 3905
 
1.5%

Most occurring characters

ValueCountFrequency (%)
0 256696
98.5%
1 3905
 
1.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 260601
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 256696
98.5%
1 3905
 
1.5%

Most occurring scripts

ValueCountFrequency (%)
Common 260601
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 256696
98.5%
1 3905
 
1.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 256696
98.5%
1 3905
 
1.5%
Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size14.4 MiB
v
250939 
a
 
5512
w
 
2677
r
 
1473

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters4
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowv
2nd rowv
3rd rowv
4th rowv
5th rowv

Common Values

ValueCountFrequency (%)
v 250939
96.3%
a 5512
 
2.1%
w 2677
 
1.0%
r 1473
 
0.6%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
v 250939
96.3%
a 5512
 
2.1%
w 2677
 
1.0%
r 1473
 
0.6%

Most occurring characters

ValueCountFrequency (%)
v 250939
96.3%
a 5512
 
2.1%
w 2677
 
1.0%
r 1473
 
0.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 260601
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
v 250939
96.3%
a 5512
 
2.1%
w 2677
 
1.0%
r 1473
 
0.6%

Most occurring scripts

ValueCountFrequency (%)
Latin 260601
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
v 250939
96.3%
a 5512
 
2.1%
w 2677
 
1.0%
r 1473
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
v 250939
96.3%
a 5512
 
2.1%
w 2677
 
1.0%
r 1473
 
0.6%

count_families
Real number (ℝ)

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.98394864
Minimum: 0
Maximum: 9
Zeros: 20862
Zeros (%): 8.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 1
Median: 1
Q3: 1
95th percentile: 2
Maximum: 9
Range: 9
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.41838898
Coefficient of variation (CV): 0.42521424
Kurtosis: 17.670943
Mean: 0.98394864
Median Absolute Deviation (MAD): 0
Skewness: 1.6347579
Sum: 256418
Variance: 0.17504934
Monotonicity: Not monotonic
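
The mean, sum, variance, standard deviation and coefficient of variation listed above can be cross-checked against the value counts reported for count_families. The sketch below is only such a cross-check: the helper name `summarize` is an assumption for this example, and so is the use of the sample variance (divisor n - 1, the pandas default), since the report does not state which divisor it uses.

```python
import math

# Value -> count, copied from the count_families frequency table in this report.
freq = {0: 20862, 1: 226115, 2: 11294, 3: 1802, 4: 389,
        5: 104, 6: 22, 7: 7, 8: 2, 9: 4}


def summarize(freq):
    """Recompute basic moments from a value -> count mapping."""
    n = sum(freq.values())
    total = sum(value * count for value, count in freq.items())
    mean = total / n
    # Sample variance with divisor n - 1 (assumed, matching the pandas default).
    variance = sum(count * (value - mean) ** 2
                   for value, count in freq.items()) / (n - 1)
    std = math.sqrt(variance)
    return {"n": n, "sum": total, "mean": mean,
            "variance": variance, "std": std, "cv": std / mean}


stats = summarize(freq)
for name, value in stats.items():
    print(name, value)
```

The coefficient of variation is the standard deviation divided by the mean, so a value near 0.43 here reflects a column concentrated almost entirely on the value 1.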
Histogram with fixed size bins (bins=10)
Value  Count   Frequency (%)
1      226115  86.8%
0      20862   8.0%
2      11294   4.3%
3      1802    0.7%
4      389     0.1%
5      104     < 0.1%
6      22      < 0.1%
7      7       < 0.1%
9      4       < 0.1%
8      2       < 0.1%
Value  Count   Frequency (%)
0      20862   8.0%
1      226115  86.8%
2      11294   4.3%
3      1802    0.7%
4      389     0.1%
5      104     < 0.1%
6      22      < 0.1%
7      7       < 0.1%
8      2       < 0.1%
9      4       < 0.1%
Value  Count   Frequency (%)
9      4       < 0.1%
8      2       < 0.1%
7      7       < 0.1%
6      22      < 0.1%
5      104     < 0.1%
4      389     0.1%
3      1802    0.7%
2      11294   4.3%
1      226115  86.8%
0      20862   8.0%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size14.4 MiB
0
231445 
1
29156 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 231445
88.8%
1 29156
 
11.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 231445
88.8%
1 29156
 
11.2%

Most occurring characters

ValueCountFrequency (%)
0 231445
88.8%
1 29156
 
11.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 260601
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 231445
88.8%
1 29156
 
11.2%

Most occurring scripts

ValueCountFrequency (%)
Common 260601
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 231445
88.8%
1 29156
 
11.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 231445
88.8%
1 29156
 
11.2%

has_secondary_use_agriculture
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size14.4 MiB
0
243824 
1
 
16777

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 243824
93.6%
1 16777
 
6.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 243824
93.6%
1 16777
 
6.4%

Most occurring characters

ValueCountFrequency (%)
0 243824
93.6%
1 16777
 
6.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 260601
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 243824
93.6%
1 16777
 
6.4%

Most occurring scripts

ValueCountFrequency (%)
Common 260601
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 243824
93.6%
1 16777
 
6.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 243824
93.6%
1 16777
 
6.4%

has_secondary_use_hotel
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size14.4 MiB
0
251838 
1
 
8763

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 251838
96.6%
1 8763
 
3.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 251838
96.6%
1 8763
 
3.4%

Most occurring characters

ValueCountFrequency (%)
0 251838
96.6%
1 8763
 
3.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 260601
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 251838
96.6%
1 8763
 
3.4%

Most occurring scripts

ValueCountFrequency (%)
Common 260601
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 251838
96.6%
1 8763
 
3.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 251838
96.6%
1 8763
 
3.4%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size14.4 MiB
0
258490 
1
 
2111

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 258490
99.2%
1 2111
 
0.8%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 258490
99.2%
1 2111
 
0.8%

Most occurring characters

ValueCountFrequency (%)
0 258490
99.2%
1 2111
 
0.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 260601
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 258490
99.2%
1 2111
 
0.8%

Most occurring scripts

ValueCountFrequency (%)
Common 260601
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 258490
99.2%
1 2111
 
0.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 258490
99.2%
1 2111
 
0.8%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size14.4 MiB
0
260356 
1
 
245

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 260356
99.9%
1 245
 
0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 260356
99.9%
1 245
 
0.1%

Most occurring characters

ValueCountFrequency (%)
0 260356
99.9%
1 245
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 260601
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 260356
99.9%
1 245
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 260601
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 260356
99.9%
1 245
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 260356
99.9%
1 245
 
0.1%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size14.4 MiB
0
260507 
1
 
94

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 260507
> 99.9%
1 94
 
< 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 260507
> 99.9%
1 94
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
0 260507
> 99.9%
1 94
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 260601
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 260507
> 99.9%
1 94
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 260601
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 260507
> 99.9%
1 94
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 260507
> 99.9%
1 94
 
< 0.1%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size14.4 MiB
0
260322 
1
 
279

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 260322
99.9%
1 279
 
0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 260322
99.9%
1 279
 
0.1%

Most occurring characters

ValueCountFrequency (%)
0 260322
99.9%
1 279
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 260601
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 260322
99.9%
1 279
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 260601
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 260322
99.9%
1 279
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 260322
99.9%
1 279
 
0.1%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size14.4 MiB
0
260552 
1
 
49

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 260552
> 99.9%
1 49
 
< 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 260552
> 99.9%
1 49
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
0 260552
> 99.9%
1 49
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 260601
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 260552
> 99.9%
1 49
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 260601
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 260552
> 99.9%
1 49
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 260552
> 99.9%
1 49
 
< 0.1%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size14.4 MiB
0
260563 
1
 
38

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 260563
> 99.9%
1 38
 
< 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 260563
> 99.9%
1 38
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
0 260563
> 99.9%
1 38
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 260601
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 260563
> 99.9%
1 38
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 260601
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 260563
> 99.9%
1 38
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 260563
> 99.9%
1 38
 
< 0.1%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size14.4 MiB
0
260578 
1
 
23

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 260578
> 99.9%
1 23
 
< 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 260578
> 99.9%
1 23
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
0 260578
> 99.9%
1 23
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 260601
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 260578
> 99.9%
1 23
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 260601
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 260578
> 99.9%
1 23
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 260578
> 99.9%
1 23
 
< 0.1%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size14.4 MiB
0
259267 
1
 
1334

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 259267
99.5%
1 1334
 
0.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 259267
99.5%
1 1334
 
0.5%

Most occurring characters

ValueCountFrequency (%)
0 259267
99.5%
1 1334
 
0.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 260601
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 259267
99.5%
1 1334
 
0.5%

Most occurring scripts

ValueCountFrequency (%)
Common 260601
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 259267
99.5%
1 1334
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 259267
99.5%
1 1334
 
0.5%

Interactions

[Figure: interaction plots (Matplotlib v3.6.2 SVG images, not recoverable from the text extraction)]
2024-04-11T11:40:18.081431image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2024-04-11T11:40:18.629281image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2024-04-11T11:40:14.820245image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2024-04-11T11:40:15.285962image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2024-04-11T11:40:15.758336image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2024-04-11T11:40:16.224832image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2024-04-11T11:40:16.751104image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2024-04-11T11:40:17.207533image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2024-04-11T11:40:17.666453image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2024-04-11T11:40:18.130574image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2024-04-11T11:40:18.675552image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2024-04-11T11:40:14.871703image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2024-04-11T11:40:15.335685image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2024-04-11T11:40:15.810148image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2024-04-11T11:40:16.276199image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2024-04-11T11:40:16.801426image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2024-04-11T11:40:17.256184image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2024-04-11T11:40:17.715807image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2024-04-11T11:40:18.179484image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2024-04-11T11:40:18.722905image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2024-04-11T11:40:14.922024image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2024-04-11T11:40:15.387463image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2024-04-11T11:40:15.861623image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2024-04-11T11:40:16.326267image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2024-04-11T11:40:16.852349image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2024-04-11T11:40:17.304936image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2024-04-11T11:40:17.767273image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2024-04-11T11:40:18.228171image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/

Correlations

variable | building_id | geo_level_1_id | geo_level_2_id | geo_level_3_id | count_floors_pre_eq | age | area_percentage | height_percentage | count_families | land_surface_condition | foundation_type | roof_type | ground_floor_type | other_floor_type | position | plan_configuration | has_superstructure_adobe_mud | has_superstructure_mud_mortar_stone | has_superstructure_stone_flag | has_superstructure_cement_mortar_stone | has_superstructure_mud_mortar_brick | has_superstructure_cement_mortar_brick | has_superstructure_timber | has_superstructure_bamboo | has_superstructure_rc_non_engineered | has_superstructure_rc_engineered | has_superstructure_other | legal_ownership_status | has_secondary_use | has_secondary_use_agriculture | has_secondary_use_hotel | has_secondary_use_rental | has_secondary_use_institution | has_secondary_use_school | has_secondary_use_industry | has_secondary_use_health_post | has_secondary_use_gov_office | has_secondary_use_use_police | has_secondary_use_other
building_id | 1.000 | -0.003 | 0.000 | -0.000 | 0.000 | 0.000 | -0.002 | 0.000 | -0.001 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.003 | 0.000 | 0.000 | 0.000 | 0.000 | 0.004 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.003 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.003
geo_level_1_id | -0.003 | 1.000 | -0.067 | 0.004 | -0.088 | -0.061 | 0.038 | -0.077 | 0.037 | 0.033 | 0.199 | 0.206 | 0.110 | 0.129 | 0.136 | 0.031 | 0.269 | 0.333 | 0.113 | 0.066 | 0.285 | 0.215 | 0.207 | 0.235 | 0.045 | 0.062 | 0.060 | 0.089 | 0.098 | 0.126 | 0.028 | 0.046 | 0.006 | 0.007 | 0.010 | 0.004 | 0.006 | 0.004 | 0.044
geo_level_2_id | 0.000 | -0.067 | 1.000 | 0.001 | 0.043 | 0.035 | -0.024 | 0.037 | -0.012 | 0.049 | 0.098 | 0.087 | 0.069 | 0.071 | 0.079 | 0.018 | 0.078 | 0.151 | 0.063 | 0.026 | 0.100 | 0.125 | 0.089 | 0.092 | 0.063 | 0.068 | 0.041 | 0.064 | 0.016 | 0.036 | 0.030 | 0.043 | 0.008 | 0.004 | 0.007 | 0.000 | 0.003 | 0.006 | 0.030
geo_level_3_id | -0.000 | 0.004 | 0.001 | 1.000 | -0.016 | -0.003 | 0.000 | -0.018 | -0.002 | 0.033 | 0.033 | 0.035 | 0.026 | 0.029 | 0.027 | 0.008 | 0.038 | 0.037 | 0.035 | 0.019 | 0.054 | 0.034 | 0.050 | 0.037 | 0.031 | 0.036 | 0.026 | 0.033 | 0.015 | 0.023 | 0.012 | 0.016 | 0.008 | 0.006 | 0.005 | 0.003 | 0.002 | 0.000 | 0.015
count_floors_pre_eq | 0.000 | -0.088 | 0.043 | -0.016 | 1.000 | 0.255 | 0.125 | 0.755 | 0.078 | 0.047 | 0.144 | 0.182 | 0.123 | 0.582 | 0.314 | 0.036 | 0.204 | 0.358 | 0.050 | 0.030 | 0.392 | 0.258 | 0.099 | 0.080 | 0.107 | 0.131 | 0.033 | 0.067 | 0.074 | 0.055 | 0.141 | 0.066 | 0.026 | 0.024 | 0.022 | 0.008 | 0.011 | 0.003 | 0.010
age | 0.000 | -0.061 | 0.035 | -0.003 | 0.255 | 1.000 | -0.017 | 0.197 | 0.047 | 0.018 | 0.035 | 0.018 | 0.031 | 0.021 | 0.114 | 0.015 | 0.090 | 0.072 | 0.026 | 0.005 | 0.150 | 0.004 | 0.015 | 0.012 | 0.010 | 0.011 | 0.003 | 0.013 | 0.007 | 0.013 | 0.005 | 0.008 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.002 | 0.006
area_percentage | -0.002 | 0.038 | -0.024 | 0.000 | 0.125 | -0.017 | 1.000 | 0.210 | 0.078 | 0.019 | 0.166 | 0.251 | 0.164 | 0.187 | 0.048 | 0.047 | 0.037 | 0.238 | 0.005 | 0.075 | 0.064 | 0.213 | 0.057 | 0.030 | 0.189 | 0.223 | 0.012 | 0.012 | 0.117 | 0.020 | 0.158 | 0.106 | 0.056 | 0.077 | 0.019 | 0.015 | 0.036 | 0.000 | 0.015
height_percentage | 0.000 | -0.077 | 0.037 | -0.018 | 0.755 | 0.197 | 0.210 | 1.000 | 0.063 | 0.019 | 0.167 | 0.235 | 0.118 | 0.299 | 0.211 | 0.020 | 0.152 | 0.263 | 0.023 | 0.032 | 0.255 | 0.144 | 0.066 | 0.062 | 0.174 | 0.231 | 0.011 | 0.038 | 0.128 | 0.058 | 0.197 | 0.106 | 0.050 | 0.041 | 0.014 | 0.018 | 0.021 | 0.000 | 0.019
count_families | -0.001 | 0.037 | -0.012 | -0.002 | 0.078 | 0.047 | 0.078 | 0.063 | 1.000 | 0.014 | 0.055 | 0.080 | 0.047 | 0.071 | 0.033 | 0.008 | 0.034 | 0.063 | 0.009 | 0.011 | 0.036 | 0.049 | 0.035 | 0.030 | 0.060 | 0.065 | 0.006 | 0.011 | 0.114 | 0.051 | 0.080 | 0.094 | 0.033 | 0.029 | 0.029 | 0.012 | 0.017 | 0.019 | 0.019
land_surface_condition | 0.000 | 0.033 | 0.049 | 0.033 | 0.047 | 0.018 | 0.019 | 0.019 | 0.014 | 1.000 | 0.032 | 0.039 | 0.045 | 0.037 | 0.033 | 0.021 | 0.021 | 0.080 | 0.046 | 0.013 | 0.064 | 0.060 | 0.047 | 0.032 | 0.012 | 0.028 | 0.035 | 0.021 | 0.009 | 0.006 | 0.013 | 0.008 | 0.004 | 0.004 | 0.000 | 0.000 | 0.000 | 0.002 | 0.017
foundation_type | 0.000 | 0.199 | 0.098 | 0.033 | 0.144 | 0.035 | 0.166 | 0.167 | 0.055 | 0.032 | 1.000 | 0.548 | 0.356 | 0.411 | 0.098 | 0.057 | 0.105 | 0.553 | 0.144 | 0.203 | 0.076 | 0.511 | 0.341 | 0.303 | 0.506 | 0.543 | 0.115 | 0.150 | 0.173 | 0.055 | 0.256 | 0.185 | 0.068 | 0.038 | 0.022 | 0.019 | 0.026 | 0.008 | 0.015
roof_type | 0.000 | 0.206 | 0.087 | 0.035 | 0.182 | 0.018 | 0.251 | 0.235 | 0.080 | 0.039 | 0.548 | 1.000 | 0.472 | 0.522 | 0.126 | 0.063 | 0.073 | 0.438 | 0.043 | 0.084 | 0.036 | 0.420 | 0.142 | 0.094 | 0.446 | 0.467 | 0.020 | 0.029 | 0.160 | 0.060 | 0.233 | 0.188 | 0.065 | 0.032 | 0.016 | 0.017 | 0.022 | 0.008 | 0.010
ground_floor_type | 0.000 | 0.110 | 0.069 | 0.026 | 0.123 | 0.031 | 0.164 | 0.118 | 0.047 | 0.045 | 0.356 | 0.472 | 1.000 | 0.364 | 0.080 | 0.059 | 0.082 | 0.499 | 0.130 | 0.153 | 0.050 | 0.586 | 0.102 | 0.083 | 0.364 | 0.363 | 0.028 | 0.032 | 0.155 | 0.068 | 0.249 | 0.162 | 0.060 | 0.037 | 0.025 | 0.018 | 0.020 | 0.006 | 0.039
other_floor_type | 0.000 | 0.129 | 0.071 | 0.029 | 0.582 | 0.021 | 0.187 | 0.299 | 0.071 | 0.037 | 0.411 | 0.522 | 0.364 | 1.000 | 0.113 | 0.064 | 0.091 | 0.451 | 0.129 | 0.097 | 0.038 | 0.443 | 0.161 | 0.070 | 0.388 | 0.420 | 0.037 | 0.065 | 0.182 | 0.065 | 0.261 | 0.196 | 0.075 | 0.044 | 0.023 | 0.019 | 0.026 | 0.005 | 0.021
position | 0.000 | 0.136 | 0.079 | 0.027 | 0.314 | 0.114 | 0.048 | 0.211 | 0.033 | 0.033 | 0.098 | 0.126 | 0.080 | 0.113 | 1.000 | 0.027 | 0.194 | 0.283 | 0.021 | 0.029 | 0.350 | 0.119 | 0.054 | 0.056 | 0.089 | 0.095 | 0.000 | 0.030 | 0.118 | 0.034 | 0.200 | 0.060 | 0.015 | 0.007 | 0.012 | 0.010 | 0.008 | 0.000 | 0.008
plan_configuration | 0.003 | 0.031 | 0.018 | 0.008 | 0.036 | 0.015 | 0.047 | 0.020 | 0.008 | 0.021 | 0.057 | 0.063 | 0.059 | 0.064 | 0.027 | 1.000 | 0.027 | 0.121 | 0.016 | 0.026 | 0.044 | 0.106 | 0.028 | 0.023 | 0.049 | 0.042 | 0.019 | 0.014 | 0.030 | 0.017 | 0.049 | 0.028 | 0.012 | 0.032 | 0.005 | 0.000 | 0.000 | 0.012 | 0.000
has_superstructure_adobe_mud | 0.000 | 0.269 | 0.078 | 0.038 | 0.204 | 0.090 | 0.037 | 0.152 | 0.034 | 0.021 | 0.105 | 0.073 | 0.082 | 0.091 | 0.194 | 0.027 | 1.000 | 0.307 | 0.007 | 0.014 | 0.315 | 0.037 | 0.012 | 0.011 | 0.037 | 0.037 | 0.057 | 0.051 | 0.013 | 0.003 | 0.012 | 0.003 | 0.004 | 0.000 | 0.000 | 0.002 | 0.001 | 0.000 | 0.010
has_superstructure_mud_mortar_stone | 0.000 | 0.333 | 0.151 | 0.037 | 0.358 | 0.072 | 0.238 | 0.263 | 0.063 | 0.080 | 0.553 | 0.438 | 0.499 | 0.451 | 0.283 | 0.121 | 0.307 | 1.000 | 0.034 | 0.104 | 0.376 | 0.471 | 0.040 | 0.055 | 0.222 | 0.224 | 0.042 | 0.147 | 0.087 | 0.058 | 0.159 | 0.118 | 0.036 | 0.023 | 0.025 | 0.008 | 0.011 | 0.002 | 0.005
has_superstructure_stone_flag | 0.000 | 0.113 | 0.063 | 0.035 | 0.050 | 0.026 | 0.005 | 0.023 | 0.009 | 0.046 | 0.144 | 0.043 | 0.130 | 0.129 | 0.021 | 0.016 | 0.007 | 0.034 | 1.000 | 0.037 | 0.033 | 0.044 | 0.125 | 0.078 | 0.008 | 0.021 | 0.066 | 0.010 | 0.000 | 0.010 | 0.009 | 0.011 | 0.000 | 0.000 | 0.003 | 0.000 | 0.001 | 0.000 | 0.000
has_superstructure_cement_mortar_stone | 0.000 | 0.066 | 0.026 | 0.019 | 0.030 | 0.005 | 0.075 | 0.032 | 0.011 | 0.013 | 0.203 | 0.084 | 0.153 | 0.097 | 0.029 | 0.026 | 0.014 | 0.104 | 0.037 | 1.000 | 0.000 | 0.079 | 0.014 | 0.003 | 0.076 | 0.025 | 0.012 | 0.012 | 0.042 | 0.016 | 0.072 | 0.034 | 0.007 | 0.005 | 0.006 | 0.003 | 0.011 | 0.003 | 0.014
has_superstructure_mud_mortar_brick | 0.004 | 0.285 | 0.100 | 0.054 | 0.392 | 0.150 | 0.064 | 0.255 | 0.036 | 0.064 | 0.076 | 0.036 | 0.050 | 0.038 | 0.350 | 0.044 | 0.315 | 0.376 | 0.033 | 0.000 | 1.000 | 0.031 | 0.000 | 0.000 | 0.029 | 0.026 | 0.026 | 0.035 | 0.010 | 0.039 | 0.025 | 0.018 | 0.001 | 0.000 | 0.011 | 0.000 | 0.002 | 0.000 | 0.004
has_superstructure_cement_mortar_brick | 0.000 | 0.215 | 0.125 | 0.034 | 0.258 | 0.004 | 0.213 | 0.144 | 0.049 | 0.060 | 0.511 | 0.420 | 0.586 | 0.443 | 0.119 | 0.106 | 0.037 | 0.471 | 0.044 | 0.079 | 0.031 | 1.000 | 0.059 | 0.055 | 0.139 | 0.121 | 0.006 | 0.078 | 0.077 | 0.054 | 0.139 | 0.109 | 0.032 | 0.019 | 0.026 | 0.008 | 0.007 | 0.004 | 0.000
has_superstructure_timber | 0.000 | 0.207 | 0.089 | 0.050 | 0.099 | 0.015 | 0.057 | 0.066 | 0.035 | 0.047 | 0.341 | 0.142 | 0.102 | 0.161 | 0.054 | 0.028 | 0.012 | 0.040 | 0.125 | 0.014 | 0.000 | 0.059 | 1.000 | 0.438 | 0.027 | 0.069 | 0.104 | 0.105 | 0.023 | 0.003 | 0.028 | 0.026 | 0.005 | 0.003 | 0.000 | 0.004 | 0.003 | 0.001 | 0.014
has_superstructure_bamboo | 0.000 | 0.235 | 0.092 | 0.037 | 0.080 | 0.012 | 0.030 | 0.062 | 0.030 | 0.032 | 0.303 | 0.094 | 0.083 | 0.070 | 0.056 | 0.023 | 0.011 | 0.055 | 0.078 | 0.003 | 0.000 | 0.055 | 0.438 | 1.000 | 0.020 | 0.037 | 0.117 | 0.087 | 0.022 | 0.004 | 0.031 | 0.019 | 0.004 | 0.003 | 0.002 | 0.003 | 0.000 | 0.001 | 0.008
has_superstructure_rc_non_engineered | 0.000 | 0.045 | 0.063 | 0.031 | 0.107 | 0.010 | 0.189 | 0.174 | 0.060 | 0.012 | 0.506 | 0.446 | 0.364 | 0.388 | 0.089 | 0.049 | 0.037 | 0.222 | 0.008 | 0.076 | 0.029 | 0.139 | 0.027 | 0.020 | 1.000 | 0.012 | 0.018 | 0.008 | 0.108 | 0.023 | 0.158 | 0.103 | 0.036 | 0.020 | 0.015 | 0.007 | 0.004 | 0.000 | 0.000
has_superstructure_rc_engineered | 0.000 | 0.062 | 0.068 | 0.036 | 0.131 | 0.011 | 0.223 | 0.231 | 0.065 | 0.028 | 0.543 | 0.467 | 0.363 | 0.420 | 0.095 | 0.042 | 0.037 | 0.224 | 0.021 | 0.025 | 0.026 | 0.121 | 0.069 | 0.037 | 0.012 | 1.000 | 0.010 | 0.013 | 0.104 | 0.029 | 0.140 | 0.131 | 0.050 | 0.024 | 0.004 | 0.010 | 0.030 | 0.003 | 0.008
has_superstructure_other | 0.000 | 0.060 | 0.041 | 0.026 | 0.033 | 0.003 | 0.012 | 0.011 | 0.006 | 0.035 | 0.115 | 0.020 | 0.028 | 0.037 | 0.000 | 0.019 | 0.057 | 0.042 | 0.066 | 0.012 | 0.026 | 0.006 | 0.104 | 0.117 | 0.018 | 0.010 | 1.000 | 0.020 | 0.000 | 0.006 | 0.007 | 0.000 | 0.001 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.006
legal_ownership_status | 0.000 | 0.089 | 0.064 | 0.033 | 0.067 | 0.013 | 0.012 | 0.038 | 0.011 | 0.021 | 0.150 | 0.029 | 0.032 | 0.065 | 0.030 | 0.014 | 0.051 | 0.147 | 0.010 | 0.012 | 0.035 | 0.078 | 0.105 | 0.087 | 0.008 | 0.013 | 0.020 | 1.000 | 0.025 | 0.012 | 0.042 | 0.005 | 0.000 | 0.008 | 0.004 | 0.000 | 0.000 | 0.000 | 0.016
has_secondary_use | 0.000 | 0.098 | 0.016 | 0.015 | 0.074 | 0.007 | 0.117 | 0.128 | 0.114 | 0.009 | 0.173 | 0.160 | 0.155 | 0.182 | 0.118 | 0.030 | 0.013 | 0.087 | 0.000 | 0.042 | 0.010 | 0.077 | 0.023 | 0.022 | 0.108 | 0.104 | 0.000 | 0.025 | 1.000 | 0.739 | 0.526 | 0.255 | 0.086 | 0.053 | 0.092 | 0.038 | 0.033 | 0.026 | 0.202
has_secondary_use_agriculture | 0.000 | 0.126 | 0.036 | 0.023 | 0.055 | 0.013 | 0.020 | 0.058 | 0.051 | 0.006 | 0.055 | 0.060 | 0.068 | 0.065 | 0.034 | 0.017 | 0.003 | 0.058 | 0.010 | 0.016 | 0.039 | 0.054 | 0.003 | 0.004 | 0.023 | 0.029 | 0.006 | 0.012 | 0.739 | 1.000 | 0.049 | 0.024 | 0.008 | 0.004 | 0.008 | 0.002 | 0.002 | 0.000 | 0.085
has_secondary_use_hotel | 0.003 | 0.028 | 0.030 | 0.012 | 0.141 | 0.005 | 0.158 | 0.197 | 0.080 | 0.013 | 0.256 | 0.233 | 0.249 | 0.261 | 0.200 | 0.049 | 0.012 | 0.159 | 0.009 | 0.072 | 0.025 | 0.139 | 0.028 | 0.031 | 0.158 | 0.140 | 0.007 | 0.042 | 0.526 | 0.049 | 1.000 | 0.017 | 0.005 | 0.002 | 0.005 | 0.000 | 0.000 | 0.000 | 0.003
has_secondary_use_rental | 0.000 | 0.046 | 0.043 | 0.016 | 0.066 | 0.008 | 0.106 | 0.106 | 0.094 | 0.008 | 0.185 | 0.188 | 0.162 | 0.196 | 0.060 | 0.028 | 0.003 | 0.118 | 0.011 | 0.034 | 0.018 | 0.109 | 0.026 | 0.019 | 0.103 | 0.131 | 0.000 | 0.005 | 0.255 | 0.024 | 0.017 | 1.000 | 0.001 | 0.000 | 0.001 | 0.000 | 0.000 | 0.000 | 0.001
has_secondary_use_institution | 0.000 | 0.006 | 0.008 | 0.008 | 0.026 | 0.000 | 0.056 | 0.050 | 0.033 | 0.004 | 0.068 | 0.065 | 0.060 | 0.075 | 0.015 | 0.012 | 0.004 | 0.036 | 0.000 | 0.007 | 0.001 | 0.032 | 0.005 | 0.004 | 0.036 | 0.050 | 0.001 | 0.000 | 0.086 | 0.008 | 0.005 | 0.001 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.003
has_secondary_use_school | 0.000 | 0.007 | 0.004 | 0.006 | 0.024 | 0.000 | 0.077 | 0.041 | 0.029 | 0.004 | 0.038 | 0.032 | 0.037 | 0.044 | 0.007 | 0.032 | 0.000 | 0.023 | 0.000 | 0.005 | 0.000 | 0.019 | 0.003 | 0.003 | 0.020 | 0.024 | 0.000 | 0.008 | 0.053 | 0.004 | 0.002 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000
has_secondary_use_industry | 0.000 | 0.010 | 0.007 | 0.005 | 0.022 | 0.000 | 0.019 | 0.014 | 0.029 | 0.000 | 0.022 | 0.016 | 0.025 | 0.023 | 0.012 | 0.005 | 0.000 | 0.025 | 0.003 | 0.006 | 0.011 | 0.026 | 0.000 | 0.002 | 0.015 | 0.004 | 0.000 | 0.004 | 0.092 | 0.008 | 0.005 | 0.001 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.003
has_secondary_use_health_post | 0.000 | 0.004 | 0.000 | 0.003 | 0.008 | 0.000 | 0.015 | 0.018 | 0.012 | 0.000 | 0.019 | 0.017 | 0.018 | 0.019 | 0.010 | 0.000 | 0.002 | 0.008 | 0.000 | 0.003 | 0.000 | 0.008 | 0.004 | 0.003 | 0.007 | 0.010 | 0.000 | 0.000 | 0.038 | 0.002 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000
has_secondary_use_gov_office | 0.000 | 0.006 | 0.003 | 0.002 | 0.011 | 0.000 | 0.036 | 0.021 | 0.017 | 0.000 | 0.026 | 0.022 | 0.020 | 0.026 | 0.008 | 0.000 | 0.001 | 0.011 | 0.001 | 0.011 | 0.002 | 0.007 | 0.003 | 0.000 | 0.004 | 0.030 | 0.000 | 0.000 | 0.033 | 0.002 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000
has_secondary_use_use_police | 0.000 | 0.004 | 0.006 | 0.000 | 0.003 | 0.002 | 0.000 | 0.000 | 0.019 | 0.002 | 0.008 | 0.008 | 0.006 | 0.005 | 0.000 | 0.012 | 0.000 | 0.002 | 0.000 | 0.003 | 0.000 | 0.004 | 0.001 | 0.001 | 0.000 | 0.003 | 0.000 | 0.000 | 0.026 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000
has_secondary_use_other | 0.003 | 0.044 | 0.030 | 0.015 | 0.010 | 0.006 | 0.015 | 0.019 | 0.019 | 0.017 | 0.015 | 0.010 | 0.039 | 0.021 | 0.008 | 0.000 | 0.010 | 0.005 | 0.000 | 0.014 | 0.004 | 0.000 | 0.014 | 0.008 | 0.000 | 0.008 | 0.006 | 0.016 | 0.202 | 0.085 | 0.003 | 0.001 | 0.003 | 0.000 | 0.003 | 0.000 | 0.000 | 0.000 | 1.000
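
The "High correlation" alerts in the overview can be related to this matrix by scanning it for pairs whose coefficient exceeds a cutoff. The following is a minimal sketch of that idea, not the report generator's own code: the helper name `high_correlation_pairs` and the cutoff of 0.5 are assumptions made for illustration, and only four variables are copied from the table to keep the example short.

```python
# Pairwise coefficients copied from the correlation table for four variables.
corr = {
    ("count_floors_pre_eq", "age"): 0.255,
    ("count_floors_pre_eq", "height_percentage"): 0.755,
    ("count_floors_pre_eq", "other_floor_type"): 0.582,
    ("age", "height_percentage"): 0.197,
    ("age", "other_floor_type"): 0.021,
    ("height_percentage", "other_floor_type"): 0.299,
}


def high_correlation_pairs(corr, threshold=0.5):
    """Return (a, b, value) for pairs with |value| >= threshold, strongest first.

    The threshold of 0.5 is an assumed cutoff, not a value stated in the report.
    """
    hits = [(a, b, v) for (a, b), v in corr.items() if abs(v) >= threshold]
    return sorted(hits, key=lambda hit: -abs(hit[2]))


pairs = high_correlation_pairs(corr)
for a, b, value in pairs:
    print(f"{a} ~ {b}: {value:.3f}")
```

With this assumed cutoff, the subset flags count_floors_pre_eq against height_percentage and other_floor_type, which is consistent with the alert that count_floors_pre_eq is highly correlated with height_percentage and one other field.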

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
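
As a rough illustration of what the two missing-value views summarize, the sketch below counts missing cells per column and prints a text-only nullity matrix. It uses only the standard library; the toy rows, their gaps, and the helper names `nullity_by_column` and `nullity_matrix` are made up for illustration (the profiled dataset itself reports 0 missing cells). With pandas, the per-column count is typically obtained with `df.isna().sum()`.

```python
# Toy records; None marks a missing cell. Column names come from the report,
# the values (and the gaps) are invented for illustration only.
rows = [
    {"building_id": 1, "age": 30, "roof_type": "n"},
    {"building_id": 2, "age": None, "roof_type": "n"},
    {"building_id": 3, "age": 10, "roof_type": None},
    {"building_id": 4, "age": None, "roof_type": "q"},
]


def nullity_by_column(rows):
    """Count missing (None) cells per column."""
    counts = {column: 0 for column in rows[0]}
    for row in rows:
        for column, value in row.items():
            if value is None:
                counts[column] += 1
    return counts


def nullity_matrix(rows):
    """One text line per row: '#' for a present cell, '.' for a missing one."""
    return ["".join("." if v is None else "#" for v in row.values()) for row in rows]


counts = nullity_by_column(rows)
matrix = nullity_matrix(rows)
print(counts)
print("\n".join(matrix))
```

The per-column counts correspond to the bar view, and the row-by-row pattern corresponds to the matrix view, where gaps that line up across columns become visible.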

Sample

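index | building_id | geo_level_1_id | geo_level_2_id | geo_level_3_id | count_floors_pre_eq | age | area_percentage | height_percentage | land_surface_condition | foundation_type | roof_type | ground_floor_type | other_floor_type | position | plan_configuration | has_superstructure_adobe_mud | has_superstructure_mud_mortar_stone | has_superstructure_stone_flag | has_superstructure_cement_mortar_stone | has_superstructure_mud_mortar_brick | has_superstructure_cement_mortar_brick | has_superstructure_timber | has_superstructure_bamboo | has_superstructure_rc_non_engineered | has_superstructure_rc_engineered | has_superstructure_other | legal_ownership_status | count_families | has_secondary_use | has_secondary_use_agriculture | has_secondary_use_hotel | has_secondary_use_rental | has_secondary_use_institution | has_secondary_use_school | has_secondary_use_industry | has_secondary_use_health_post | has_secondary_use_gov_office | has_secondary_use_use_police | has_secondary_use_other
0 | 802906 | 6 | 487 | 12198 | 2 | 30 | 6 | 5 | t | r | n | f | q | t | d | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | v | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
1 | 28830 | 8 | 900 | 2812 | 2 | 10 | 8 | 7 | o | r | n | x | q | s | d | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | v | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
2 | 94947 | 21 | 363 | 8973 | 2 | 10 | 5 | 5 | t | r | n | f | x | t | d | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | v | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
3 | 590882 | 22 | 418 | 10694 | 2 | 10 | 6 | 5 | t | r | n | f | x | s | d | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | v | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
4 | 201944 | 11 | 131 | 1488 | 3 | 30 | 8 | 9 | t | r | n | f | x | s | d | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | v | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
5 | 333020 | 8 | 558 | 6089 | 2 | 10 | 9 | 5 | t | r | n | f | q | s | d | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | v | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
6 | 728451 | 9 | 475 | 12066 | 2 | 25 | 3 | 4 | n | r | n | x | q | s | d | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | v | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
7 | 475515 | 20 | 323 | 12236 | 2 | 0 | 8 | 6 | t | w | q | v | x | s | u | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | v | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
8 | 441126 | 0 | 757 | 7219 | 2 | 15 | 8 | 6 | t | r | q | f | q | s | d | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | v | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
9 | 989500 | 26 | 886 | 994 | 1 | 0 | 13 | 4 | t | i | n | v | j | s | d | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | v | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0

index | building_id | geo_level_1_id | geo_level_2_id | geo_level_3_id | count_floors_pre_eq | age | area_percentage | height_percentage | land_surface_condition | foundation_type | roof_type | ground_floor_type | other_floor_type | position | plan_configuration | has_superstructure_adobe_mud | has_superstructure_mud_mortar_stone | has_superstructure_stone_flag | has_superstructure_cement_mortar_stone | has_superstructure_mud_mortar_brick | has_superstructure_cement_mortar_brick | has_superstructure_timber | has_superstructure_bamboo | has_superstructure_rc_non_engineered | has_superstructure_rc_engineered | has_superstructure_other | legal_ownership_status | count_families | has_secondary_use | has_secondary_use_agriculture | has_secondary_use_hotel | has_secondary_use_rental | has_secondary_use_institution | has_secondary_use_school | has_secondary_use_industry | has_secondary_use_health_post | has_secondary_use_gov_office | has_secondary_use_use_police | has_secondary_use_other
260591 | 560805 | 20 | 368 | 5980 | 1 | 25 | 5 | 3 | n | r | n | f | j | s | d | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | v | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
260592 | 207683 | 10 | 1382 | 1903 | 2 | 25 | 5 | 5 | t | r | n | f | q | s | d | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | v | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
260593 | 226421 | 8 | 767 | 8613 | 2 | 5 | 13 | 5 | t | r | n | f | q | s | d | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | v | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
260594 | 159555 | 27 | 181 | 1537 | 6 | 0 | 13 | 12 | t | r | n | f | x | j | d | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | v | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
260595 | 827012 | 8 | 268 | 4718 | 2 | 20 | 8 | 5 | t | r | n | f | q | s | d | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | v | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
260596 | 688636 | 25 | 1335 | 1621 | 1 | 55 | 6 | 3 | n | r | n | f | j | s | q | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | v | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
260597 | 669485 | 17 | 715 | 2060 | 2 | 0 | 6 | 5 | t | r | n | f | q | s | d | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | v | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
260598 | 602512 | 17 | 51 | 8163 | 3 | 55 | 6 | 7 | t | r | q | f | q | s | d | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | v | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
260599 | 151409 | 26 | 39 | 1851 | 2 | 10 | 14 | 6 | t | r | x | v | s | j | d | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | v | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
260600 | 747594 | 21 | 9 | 9101 | 3 | 10 | 7 | 6 | n | r | n | f | q | j | d | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | v | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0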